Canonical Variate Analysis (CVA) biplot

Aim: A dimension-reduction technique that maximises between-class variation while minimising within-class variation.

This is achieved through three steps:

  • Decompose the variance.
  • Find a linear mapping to the canonical space.
  • Find a low-dimensional approximation.

Variance Decomposition

The classical variance decomposition \[\mathbf{T}=\mathbf{B}+\mathbf{W}\]

has the following analogue in this setting: \[ \mathbf{X}'\mathbf{X} = \bar{\mathbf{X}}'\mathbf{C}\bar{\mathbf{X}} + \mathbf{X}'[\mathbf{I} - \mathbf{G}(\mathbf{G}'\mathbf{G})^{-1}\mathbf{C}(\mathbf{G}'\mathbf{G})^{-1}\mathbf{G}']\mathbf{X}, \] where \(\mathbf{G}\) is the \(n \times G\) class indicator matrix and \(\bar{\mathbf{X}} = (\mathbf{G}'\mathbf{G})^{-1}\mathbf{G}'\mathbf{X}\) is the matrix of class means.

The choice of \(\mathbf{C}\) determines the variant of CVA:

  • Weighted: \(\mathbf{C}=\mathbf{N}=\mathbf{G'G}\)
  • Unweighted: \(\mathbf{C}=\mathbf{I}_G - G^{-1}\mathbf{1}_G\mathbf{1}_G'\)
  • Unweighted with weighted centroid: \(\mathbf{C}=\mathbf{I}_G\)
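The decomposition can be verified numerically. Below is a minimal base-R sketch for the weighted case \(\mathbf{C}=\mathbf{N}=\mathbf{G'G}\), using the iris data as an example; this is an illustration only, not the biplotEZ implementation:

```r
# Minimal base-R illustration of the variance decomposition, weighted case
# (C = N = G'G), using the iris data. A sketch, not the biplotEZ code.
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)
G <- model.matrix(~ iris$Species - 1)   # n x G class indicator matrix
N <- t(G) %*% G                         # diagonal matrix of class sizes
Xbar <- solve(N) %*% t(G) %*% X         # G x p matrix of class means

B <- t(Xbar) %*% N %*% Xbar             # between-class component: Xbar' C Xbar
W <- t(X) %*% X - B                     # within-class component
all.equal(t(X) %*% X, B + W)            # TRUE: T = B + W
```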

Linear Mapping

Find a linear mapping

\[\mathbf{Y}=\mathbf{X}\mathbf{M}, \tag{1}\]

such that \[\frac{\mathbf{m}'\mathbf{B}\mathbf{m}}{\mathbf{m}'\mathbf{W}\mathbf{m}} \tag{2}\] is maximised subject to \(\mathbf{m}'\mathbf{W}\mathbf{m}=1\).

It can be shown that this leads to the following equivalent eigen equations:

\[ \mathbf{W}^{-1}\mathbf{BM} = \mathbf{M \Lambda} \tag{3} \]

\[ \mathbf{BM} = \mathbf{WM \Lambda} \tag{4} \]

\[ (\mathbf{W}^{-\frac{1}{2}} \mathbf{B} \mathbf{W}^{-\frac{1}{2}}) (\mathbf{W}^{\frac{1}{2}} \mathbf{M}) = (\mathbf{W}^{\frac{1}{2}} \mathbf{M}) \mathbf{\Lambda} \tag{5} \]

with \(\mathbf{M'BM}= \mathbf{\Lambda}\) and \(\mathbf{M'WM}= \mathbf{I}\).

Since the matrix \(\mathbf{W}^{-\frac{1}{2}} \mathbf{B} \mathbf{W}^{-\frac{1}{2}}\) is symmetric and positive semi-definite, the eigenvalues in the matrix \(\mathbf{\Lambda}\) are non-negative and can be arranged in decreasing order. Since \(\operatorname{rank}(\mathbf{B}) = \min(p, G-1)\), only the first \(\operatorname{rank}(\mathbf{B})\) eigenvalues are non-zero. We form the canonical variates with the transformation

\[ \bar{\mathbf{Y}} = \bar{\mathbf{X}}\mathbf{M}. \]
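A minimal base-R sketch of this eigen decomposition (iris data, weighted variant; an illustration, not the biplotEZ implementation). Note that the eigenvectors of the symmetric matrix are \(\mathbf{W}^{\frac{1}{2}}\mathbf{M}\), so \(\mathbf{M}\) is recovered by premultiplying with \(\mathbf{W}^{-\frac{1}{2}}\):

```r
# Minimal base-R sketch of the CVA eigen equations (iris data, weighted
# variant). An illustration, not the biplotEZ implementation.
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)
G <- model.matrix(~ iris$Species - 1)
N <- t(G) %*% G
Xbar <- solve(N) %*% t(G) %*% X
B <- t(Xbar) %*% N %*% Xbar
W <- t(X) %*% X - B

# W^(-1/2) from the spectral decomposition of W
eW <- eigen(W, symmetric = TRUE)
Winv.sqrt <- eW$vectors %*% diag(1 / sqrt(eW$values)) %*% t(eW$vectors)

# Symmetric eigen equation: eigenvectors are W^(1/2) M, so M = W^(-1/2) V
eig <- eigen(Winv.sqrt %*% B %*% Winv.sqrt, symmetric = TRUE)
M <- Winv.sqrt %*% eig$vectors
Lambda <- diag(eig$values)   # ordered; only min(p, G-1) = 2 are non-zero

all.equal(t(M) %*% W %*% M, diag(ncol(X)), check.attributes = FALSE)  # M'WM = I
```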

Low dimensional approximation

The first two canonical variates are given by:

\[\mathbf{\bar{Z}}=\mathbf{\bar{Y}}\mathbf{J}_2=\mathbf{\bar{X}}\mathbf{M}\mathbf{J}_2 \tag{6}\] where \(\mathbf{J}_2'=[\mathbf{I}_2 \quad \mathbf{0}]\). We add the individual sample points with the same transformation \[\mathbf{Z}=\mathbf{X}\mathbf{M}\mathbf{J}_2. \tag{7}\]

A new sample point, \(\mathbf{x}^*\), can be added by interpolation \[\mathbf{z}^*=\mathbf{x}^*\mathbf{M}\mathbf{J}_2.\tag{8}\]
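The projection and interpolation steps can be sketched in base R as follows (iris data, weighted variant; an illustration, not the biplotEZ implementation):

```r
# Minimal base-R sketch of the 2-D projection and interpolation (iris data,
# weighted variant). An illustration, not the biplotEZ implementation.
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)
G <- model.matrix(~ iris$Species - 1)
N <- t(G) %*% G
Xbar <- solve(N) %*% t(G) %*% X
B <- t(Xbar) %*% N %*% Xbar
W <- t(X) %*% X - B
eW <- eigen(W, symmetric = TRUE)
Winv.sqrt <- eW$vectors %*% diag(1 / sqrt(eW$values)) %*% t(eW$vectors)
M <- Winv.sqrt %*% eigen(Winv.sqrt %*% B %*% Winv.sqrt, symmetric = TRUE)$vectors

J2 <- diag(ncol(X))[, 1:2]        # J_2 selects the first two canonical variates
Zbar  <- Xbar %*% M %*% J2        # class means in the biplot, eq. (6)
Z     <- X %*% M %*% J2           # individual sample points, eq. (7)
zstar <- X[1, , drop = FALSE] %*% M %*% J2   # interpolating a point, eq. (8)
```

Plotting Z coloured by iris$Species shows the three classes separated mainly along the first canonical variate.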

The function CVA()

The CVA() function takes the following arguments:

  • bp: Object of class biplot.
  • classes: Vector of class membership. User specified; otherwise defaults to the vector specified in biplot.
  • dim.biplot: Dimension of the biplot. Only the values 1, 2 and 3 are accepted, with default 2.
  • e.vects: Which eigenvectors (canonical variates) to extract, with default 1:dim.biplot.
  • weightedCVA: "weighted", "unweightedCent" or "unweightedI": controls which variant of CVA to perform, with default "weighted".
  • show.class.means: TRUE or FALSE: controls whether class means are plotted, with default TRUE.
  • low.dim: "sample.opt" or "Bhattacharyya.dist": controls the method of constructing additional dimension(s) if dim.biplot exceeds the dimension of the canonical space (the number of classes minus one), with default "sample.opt".
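Assuming the biplotEZ package is installed, the arguments above can be combined in a single pipeline; for example, an illustrative call on the state.x77 data:

```r
# Illustrative CVA call combining the arguments described above;
# assumes the biplotEZ package is installed and attached.
library(biplotEZ)
biplot(state.x77) |>
  CVA(classes = state.region, dim.biplot = 2,
      weightedCVA = "weighted", show.class.means = TRUE) |>
  plot()
```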

The function fit.measures()

bp <- biplot(state.x77) |> CVA(classes = state.region) |> fit.measures()

The fitted object contains the following measures of how well the biplot represents the information in the original and canonical spaces:

  • quality: Quality of fit for the canonical and original variables
  • adequacy: Adequacy of the original variables
  • axis.predictivity: Axis predictivity
  • class.predictivity: Class predictivity
  • within.class.axis.predictivity: Within-class axis predictivity
  • within.class.sample.predictivity: Within-class sample predictivity

The function summary()

The summary() function prints to screen the fit measures stored in the object of class biplot.

bp |> summary()
# Object of class biplot, based on 50 samples and 8 variables.
# 8 numeric variables.
# 4 classes: Northeast South North Central West 
# 
# Quality of fit of canonical variables in 2 dimension(s) = 91.9% 
# Quality of fit of original variables in 2 dimension(s) = 93.4% 
# Adequacy of variables in 2 dimension(s):
#  Population      Income  Illiteracy    Life Exp      Murder     HS Grad 
# 0.453533269 0.105327455 0.107221535 0.002201286 0.208653101 0.687840023 
#       Frost        Area 
# 0.452308013 0.118544323 
# Axis predictivity in 2 dimension(s):
# Population     Income Illiteracy   Life Exp     Murder    HS Grad      Frost 
#  0.9873763  0.9848608  0.8757913  0.9050208  0.9955088  0.9970346  0.9558192 
#       Area 
#  0.9344651 
# Class predictivity in 2 dimension(s):
#     Northeast         South North Central          West 
#     0.8031465     0.9985089     0.6449906     0.9988469 
# Within class axis predictivity in 2 dimension(s):
# Population     Income Illiteracy   Life Exp     Murder    HS Grad      Frost 
# 0.02246821 0.10349948 0.27870637 0.21460313 0.29836047 0.87510975 0.22320989 
#       Area 
# 0.13603927 
# Within class sample predictivity in 2 dimension(s):
#        Alabama         Alaska        Arizona       Arkansas     California 
#    0.769417280    0.174566384    0.328610375    0.148035077    0.103141908 
#       Colorado    Connecticut       Delaware        Florida        Georgia 
#    0.357627854    0.079176621    0.438089663    0.327270922    0.558038750 
#         Hawaii          Idaho       Illinois        Indiana           Iowa 
#    0.029173037    0.167543892    0.076948041    0.473148418    0.592667777 
#         Kansas       Kentucky      Louisiana          Maine       Maryland 
#    0.774719240    0.439306768    0.190654770    0.086183357    0.284829878 
#  Massachusetts       Michigan      Minnesota    Mississippi       Missouri 
#    0.428103056    0.188094295    0.644844800    0.163103449    0.719255739 
#        Montana       Nebraska         Nevada  New Hampshire     New Jersey 
#    0.239142302    0.671350698    0.015766988    0.386053551    0.207503850 
#     New Mexico       New York North Carolina   North Dakota           Ohio 
#    0.012872885    0.008101305    0.872322617    0.457852394    0.092634247 
#       Oklahoma         Oregon   Pennsylvania   Rhode Island South Carolina 
#    0.561156131    0.158926944    0.261838286    0.482912999    0.229047767 
#   South Dakota      Tennessee          Texas           Utah        Vermont 
#    0.095865021    0.237667483    0.121494852    0.349495632    0.256983459 
#       Virginia     Washington  West Virginia      Wisconsin        Wyoming 
#    0.453608981    0.044780371    0.346223950    0.544998639    0.174849092

The function rotate()

The rotate() function rotates the samples and axes in the biplot by rotate.degrees degrees.

par(mfrow=c(1,2))
bp |> plot()
bp |> rotate(rotate.degrees=90)|> plot()

The function reflect()

The reflect() function reflects the samples and axes in the biplot about an axis: x (horizontal reflection), y (vertical reflection) or xy (diagonal reflection).

par(mfrow=c(1,2))
bp |> plot()
bp |> reflect(reflect.axis ="y")|> plot()

The argument zoom=TRUE in plot()

The argument zoom is FALSE by default. If zoom=TRUE, a new graphical device is launched and the user is prompted to click on the desired upper left and lower right corners of the zoomed-in plot.

bp |>  plot(zoom=TRUE)

One-dimensional CVA biplot of the state.x77 data

biplot(state.x77, classes = state.region) |> CVA(dim.biplot = 1) |> classify() |> plot()

One-dimensional PCA biplot of the iris data

biplot(iris, group.aes = iris$Species) |> PCA(dim.biplot = 1) |> density1D() |> ellipses() |> plot()
# Computing 1.96 -ellipse for setosa 
# Computing 1.96 -ellipse for versicolor 
# Computing 1.96 -ellipse for virginica